Deep architectures for protein contact map prediction
Abstract
MOTIVATION: Residue-residue contact prediction is important for protein structure prediction and other applications. However, the accuracy of current contact predictors often barely exceeds 20% on long-range contacts, falling short of the level required for ab initio structure prediction.

RESULTS: Here, we develop a novel machine learning approach for contact map prediction using three steps of increasing resolution. First, we use 2D recursive neural networks to predict coarse contacts and orientations between secondary structure elements. Second, we use an energy-based method to align secondary structure elements and predict contact probabilities between residues in contacting alpha-helices or strands. Third, we use a deep neural network architecture to organize and progressively refine the prediction of contacts, integrating information over both space and time. We train the architecture on a large set of non-redundant proteins and test it on a large set of non-homologous domains, as well as on the set of protein domains used for contact prediction in the two most recent CASP experiments, CASP8 and CASP9. For long-range contacts, the accuracy of the new CMAPpro predictor is close to 30%, a significant increase over existing approaches.

AVAILABILITY: CMAPpro is available as part of the SCRATCH suite at http://scratch.proteomics.ics.uci.edu/.

CONTACT: [email protected]

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
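To make the notions of "contact map" and "long-range accuracy" concrete, here is a minimal Python sketch. It is not the CMAPpro method and is not taken from the paper. The 8 Å distance cutoff, the sequence separation of at least 24 residues for "long-range", and scoring the top L/5 predictions are assumptions that follow common CASP-style evaluation practice. The function names and the toy hairpin coordinates are hypothetical.

```python
import math

def contact_map(coords, threshold=8.0):
    """Boolean contact matrix: True where two residues are closer than `threshold`.

    `coords` is a list of (x, y, z) tuples, one per residue.
    The 8.0 Angstrom default is an assumed, commonly used cutoff.
    """
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) < threshold for j in range(n)]
            for i in range(n)]

def long_range_accuracy(pred_probs, true_map, min_sep=24, top_k=None):
    """Fraction of the top_k highest-scoring long-range pairs that are true contacts.

    Pairs (i, j) count as long-range when j - i >= min_sep (assumed convention).
    If top_k is None, L/5 pairs are scored, where L is the sequence length.
    """
    n = len(true_map)
    pairs = [(pred_probs[i][j], i, j)
             for i in range(n) for j in range(i + min_sep, n)]
    if not pairs:
        return 0.0
    if top_k is None:
        top_k = max(1, n // 5)
    # Rank candidate pairs by predicted contact probability, highest first.
    pairs.sort(key=lambda t: t[0], reverse=True)
    top = pairs[:top_k]
    hits = sum(1 for _, i, j in top if true_map[i][j])
    return hits / len(top)

# Toy hairpin of 30 residues: 15 residues run along x, then 15 run back
# 5 Angstrom away, so the two ends of the chain come into contact.
coords = [(3.8 * i, 0.0, 0.0) if i < 15 else (3.8 * (29 - i), 5.0, 0.0)
          for i in range(30)]
true_map = contact_map(coords)

# A hypothetical "perfect" predictor assigns probability 1.0 to true contacts.
perfect = [[1.0 if c else 0.0 for c in row] for row in true_map]
print(long_range_accuracy(perfect, true_map))
```

Under this kind of metric, the abstract's figures would mean that roughly 20% to 30% of the top-ranked long-range pairs are true contacts, although the exact evaluation protocol used by the authors is not given here.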
Related articles
Deep Spatio-Temporal Architectures and Learning for Protein Structure Prediction
Residue-residue contact prediction is a fundamental problem in protein structure prediction. However, despite considerable research efforts, contact prediction methods are still largely unreliable. Here we introduce a novel deep machine-learning architecture which consists of a multidimensional stack of learning modules. For contact prediction, the idea is implemented as a three-dimensional stack...
Prediction of Contact Maps by Recurrent Neural Network Architectures and Hidden Context Propagation From All Four Cardinal Corners
ABSTRACT Motivation: Accurate prediction of protein contact maps is an important step in computational structural proteomics. Because contact maps provide a translation and rotation invariant topological representation of a protein, they can be used as a fundamental intermediary step in protein structure prediction. Results: We develop a new set of flexible machine learning architectures for th...
Prediction of contact maps by GIOHMMs and recurrent neural networks using lateral propagation from all four cardinal corners
MOTIVATION Accurate prediction of protein contact maps is an important step in computational structural proteomics. Because contact maps provide a translation and rotation invariant topological representation of a protein, they can be used as a fundamental intermediary step in protein structure prediction. RESULTS We develop a new set of flexible machine learning architectures for the predict...
New Machine Learning Methods for the Prediction of Protein Topologies
Protein structures are translation and rotation invariant. In protein structure prediction, it is therefore important to be able to assess and predict intermediary topological representations, such as distance or contact maps, that are translation and rotation invariant. Here we develop several new machine learning methods for the prediction and assessment of fine-grained and coarse topological...
Striped sheets and protein contact prediction
MOTIVATION Current approaches to contact map prediction in proteins have focused on amino acid conservation and patterns of mutation at sequentially distant positions. This sequence information is poorly understood and very little progress has been made in this area during recent years. RESULTS In this study, an observation of 'striped' sequence patterns across beta-sheets prompted the develo...
Journal:
- Bioinformatics
Volume 28, Issue 19
Pages: -
Publication date: 2012